
Conversation

@esmeetu (Member) commented Nov 4, 2025

Signed-off-by: esmeetu [email protected]

Purpose

Test Plan

Test Result

In addition to fixing the stream blocking that occurs when setting "VLLM_DEBUG_LOG_API_SERVER_RESPONSE": "true", I also optimized the response logging format:

Log before:

(APIServer pid=2143672) INFO 11-04 23:23:48 [api_server.py:1580] response_body={streaming_complete: content='<think>
(APIServer pid=2143672) INFO 11-04 23:23:48 [api_server.py:1580] Okay, the user just said "hi." I need to respond appropriately. Let me think. Since they didn't ask a specific question, a simple greeting like "Hi there!" works. I should make sure the response is friendly and open-ended. Maybe add an emoji to keep it warm. Let me check if there's any cultural nuance I should consider, but since they didn't mention anything, just a standard greeting is fine. Alright, time to send that.
(APIServer pid=2143672) INFO 11-04 23:23:48 [api_server.py:1580] </think>
(APIServer pid=2143672) INFO 11-04 23:23:48 [api_server.py:1580] 
(APIServer pid=2143672) INFO 11-04 23:23:48 [api_server.py:1580] Hi there! 😊 How can I assist you today?', chunks=114}

Log after:

[api_server.py:1574] response_body={streaming_complete: content='<think>\nOkay, the user just said "hi". I need to respond appropriately. Let me think. Since they\'re greeting me, I should acknowledge it. Maybe say "Hi!" and offer help. Keep it friendly and open-ended. Let them know I\'m here to assist. Make sure the response is simple and polite. No need for any complicated phrases. Just a standard greeting and offer to help.\n</think>\n\nHi! How can I assist you today? 😊', chunks=98}
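
The single-line "after" output comes from interpolating the content with %r instead of %s: repr() escapes embedded newlines, so the whole record stays on one log line. A minimal standalone sketch of the difference (illustrative content string, not the actual vLLM logging call):

    import logging

    logging.basicConfig(format="%(message)s", level=logging.INFO)
    logger = logging.getLogger("api_server")

    full_content = "<think>\nsome reasoning...\n</think>\n\nHi! How can I assist you today?"

    # "%s" inserts the raw string, so the embedded newlines break the record
    # across several log lines (the "before" output above).
    logger.info("response_body={streaming_complete: content='%s', chunks=%d}",
                full_content, 114)

    # "%r" inserts repr(full_content): newlines become literal \n, keeping
    # the whole record on a single line (the "after" output above).
    logger.info("response_body={streaming_complete: content=%r, chunks=%d}",
                full_content, 98)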

Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

@gemini-code-assist (bot) left a comment

Code Review

This pull request correctly fixes a critical issue where enabling response logging would block streaming responses. The approach of wrapping the response iterator in an async generator is well-implemented and resolves the problem effectively. Additionally, the logging format for responses has been improved for better readability. My review includes a suggestion to further enhance the new logging function for better maintainability and robustness.
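
A minimal sketch of the iterator-wrapping approach the review describes; the log_response_body name and the middleware usage shown are illustrative assumptions, not the actual vLLM code:

    import logging
    from collections.abc import AsyncIterator

    logger = logging.getLogger("api_server")

    async def log_response_body(
        body_iterator: AsyncIterator[bytes],
    ) -> AsyncIterator[bytes]:
        """Forward streamed chunks as they arrive; log the assembled body
        only after the upstream iterator is exhausted."""
        chunks: list[bytes] = []
        async for chunk in body_iterator:
            chunks.append(chunk)
            yield chunk  # pass the chunk through immediately, never buffer the stream
        full_content = b"".join(chunks).decode("utf-8", errors="replace")
        logger.info(
            "response_body={streaming_complete: content=%r, chunks=%d}",
            full_content,
            len(chunks),
        )

    # Hypothetical usage in a Starlette/FastAPI-style middleware:
    #     response.body_iterator = log_response_body(response.body_iterator)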

@chatgpt-codex-connector (bot) left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Signed-off-by: esmeetu <[email protected]>
Signed-off-by: esmeetu <[email protected]>
@chaunceyjiang (Collaborator) commented:

/cc @lengrongfu PTAL.

@markmc (Member) left a comment

Thanks for your contribution!

I would like to see the bugfix and the optimization as separate PRs.

It took me quite some time to realize the bugfix is simply this:

                         logger.info(
                             "response_body={streaming_complete: "
-                            "content='%s', chunks=%d}",
+                            "content=%r, chunks=%d}",
                             full_content,
                             chunk_count,

Let's make this PR a (highly valuable!) one-line bugfix.

This streaming code is quite tricky, so I would prefer to fix the bug first, then make the more substantial optimization changes. And in the new PR with the optimization changes, it would be very helpful to include an explanation as to why the changes are an improvement.

@esmeetu (Member, Author) commented Nov 6, 2025

@markmc Thanks for the review!
That makes sense — I’ll split the bugfix and the optimization into two separate PRs.

@esmeetu changed the title from "[Frontend] Fix stream block and log format when enable response logging" to "[Frontend] Fix logging format when enable response logging" Nov 6, 2025
Signed-off-by: esmeetu <[email protected]>
@esmeetu requested a review from markmc November 6, 2025 13:36
 logger.info(
-    "response_body={streaming_complete: "
-    "content='%s', chunks=%d}",
+    "response_body={streaming_complete: content=%r, chunks=%d}",
A Member commented:

Perhaps a nitpick - but the single quotes surrounding the content do aid readability, so let's add those back? 👍

@esmeetu (Member, Author) replied:

Good point — adding quotes manually will actually result in double quotes like:

response_body={streaming_complete: content=''...''}

Using %r already preserves the quotes automatically for readability.

response_body={streaming_complete: content='...'}
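
A quick way to see the quote-doubling in plain Python (illustrative string, not vLLM code):

    content = "Hi! How can I assist you today?"

    # Manual quotes around %r double up, since repr() adds its own quotes:
    print("content='%r'" % content)  # content=''Hi! How can I assist you today?''

    # %r alone already produces the readable single-quoted form:
    print("content=%r" % content)    # content='Hi! How can I assist you today?'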

@markmc added the ready label and removed the frontend label Nov 6, 2025
@mergify (bot) added the frontend label Nov 6, 2025
@esmeetu self-assigned this Nov 6, 2025
@esmeetu enabled auto-merge (squash) November 6, 2025 16:04
@esmeetu merged commit d1dd5f5 into vllm-project:main Nov 6, 2025
51 of 52 checks passed
ZhengHongming888 pushed a commit to ZhengHongming888/vllm that referenced this pull request Nov 8, 2025

Labels

frontend, ready (ONLY add when PR is ready to merge/full CI is needed)

4 participants